Weighted percolation on directed networks
We present an analysis of the percolation transition for general node removal strategies valid
for locally tree-like directed networks. On the basis of heuristic arguments we predict that, if the
probability of removing node $i$ is $p_i$, the network disintegrates if $p_i$ is such that the largest eigenvalue
of the matrix with entries $A_{ij}(1 - p_i)$ is less than 1, where $A$ is the adjacency matrix of the network.
The knowledge or applicability of a Markov network model is not required by our theory, thus 
making it applicable to situations not covered by previous works. We test our predicted percolation 
criterion against numerical results for different networks and node removal strategies. 



There has been much recent interest in the structure and function of complex
networks [1]. One aspect that has received considerable attention is the
resilience of networks to the removal of some of their nodes
[2, 3, 4, 5, 6, 7, 8, 9]. This problem is related, for example, to the
robustness of transportation and information networks to disturbances like
random failures or targeted attacks, or to the resistance of biological
networks to the action of drugs. Another related problem is determining the
threshold for epidemic spreading [10]. An important objective is to determine
what characteristics of the structure of the network determine the proportion
of nodes that can be removed before the network disintegrates.

A model often considered is one in which nodes are removed from a network of
size $N$ with a uniform probability $p$. In the large $N$ limit, for
probabilities less than a critical value $p_c$, there is a connected component
of the network of size of order $N$ (the giant component), and for values of
$p$ larger than $p_c$, there is no connected component of size of order $N$
[?]. The critical probability $p_c$ at which this percolation transition
occurs has been the subject of several theoretical works, and various
approximations have been put forward.

In what follows we define the in and out degrees of a node $i$ by
$d_i^{out} = \sum_{j=1}^{N} A_{ij}$ and $d_i^{in} = \sum_{j=1}^{N} A_{ji}$.
Here $A_{ij}$ are the entries of the network adjacency matrix $A$;
$A_{ij} = 1$ if there is a directed link from $i$ to $j$ and 0 otherwise.
If $A = A^T$ the network is said to be undirected and
$d_i^{out} = d_i^{in} = d_i$. For undirected, degree uncorrelated networks
(the number of connections per node for neighboring nodes is not correlated),
Cohen et al. have shown [?] that the critical probability is approximately
given by $(1 - p_c)\left[(\langle d^2\rangle/\langle d\rangle) - 1\right] = 1$,
where $\langle\cdot\rangle$ denotes an average over nodes. Reference [?]
treats the case of undirected networks with correlations for degree Markovian
networks, i.e., networks in which all nontrivial correlations are captured by
the probability $P(d'|d)$ that a randomly chosen link from a node with degree
$d$ is connected to a node with degree $d'$. Reference [7] generalizes the
result in Ref. [?] and obtains a critical value of $p_c$ given by
$(1 - p_c)\Lambda = 1$, where $\Lambda$ is the largest eigenvalue of the
matrix with entries $C_{dd'} = (d' - 1)P(d'|d)$. Other works on undirected
networks have extended the Markovian approach to include the effect of
clustering (often present, e.g., in social networks); in particular Ref. [9]
presents a general analysis for the case of weak clustering.
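As a quick numerical illustration of the uncorrelated, undirected criterion quoted above, the following minimal sketch solves $(1 - p_c)[\langle d^2\rangle/\langle d\rangle - 1] = 1$ for $p_c$ given a degree sequence. The function name and the sample degree sequence are illustrative assumptions, not part of the original analysis.

```python
import numpy as np

def critical_probability_uncorrelated(degrees):
    """Solve (1 - p_c) * (<d^2>/<d> - 1) = 1 for p_c (undirected, uncorrelated case)."""
    d = np.asarray(degrees, dtype=float)
    kappa = (d ** 2).mean() / d.mean()      # <d^2> / <d>
    if kappa <= 2.0:
        # The criterion has no solution with p_c > 0: no giant component is
        # predicted even before any node is removed.
        return 0.0
    return 1.0 - 1.0 / (kappa - 1.0)

# Every node has degree 3: <d^2>/<d> = 3, so p_c = 1 - 1/2.
print(critical_probability_uncorrelated([3, 3, 3, 3]))
```

Only the first two moments of the degree sequence enter, which is why this approximation cannot capture correlations or node-dependent removal probabilities.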

Reference [8] studies the percolation transition in directed degree Markovian
networks. The types of components studied are a strongly connected component
(SCC), defined as a set of nodes such that every node in the SCC is reachable
from any other node in the SCC by a directed path, its associated in-component
(IN), defined as the set of nodes from which the SCC is reachable by a
directed path, and out-component (OUT), defined as the set of nodes reachable
from the SCC by a directed path.
Notice that there might be several such components in 
a particular network. Of interest is the largest strongly 
connected component which, if its size is of order N, is 
called the giant strongly connected component, GSCC. 
The out and in components of the GSCC are denoted 
GOUT and GIN. 

It was found in Ref. [8] that as the probability of node removal $p$ is
increased, GSCC, GOUT, and GIN disappear at the same critical value $p_c$.
This value was found to be determined by the largest eigenvalue of a matrix
expressed in terms of $P_o(y'|y)$ and $P_b(y'|y)$ where
$y = (d_p^{in}, d_p^{out}, d^{bi})$, and $d_p^{in}$, $d_p^{out}$ and $d^{bi}$
are the numbers of purely incoming, purely outgoing and bidirectional edges
for a given node (i.e., an edge $A_{ij} = 1$ is purely outgoing from node $i$
if $A_{ji} = 0$). $P_o(y'|y)$ and $P_b(y'|y)$ are the probabilities of
reaching a node of degree $y'$ from a node of degree $y$ by following an
outgoing and a bidirectional edge, respectively.

One of our aims in this paper is to remove the need for the applicability and
knowledge of a Markov network model. In order to do so, in this paper we will
focus on a class of directed networks that are locally tree-like in the sense
that they have few short loops. More precisely, we assume that for each node
$i$ and not too large $L$, the number of different nodes reachable by paths of
length $L$ or less starting at node $i$ is close to the total number of paths
of length $L$ or less starting from node $i$. In particular ($L = 2$) we will
assume that bidirectional edges are absent or negligible in number. Under this
assumption, $y = (d^{in}, d^{out}, 0)$, and the matrix in Ref. [8] whose
eigenvalue determines the percolation transition reduces to

$$C_{zz'} = (d^{out})' P(z'|z), \qquad (1)$$

where $z = (d^{in}, d^{out})$. We note that our locally tree-like condition
for directed networks is analogous to assuming negligible clustering.

In many situations the node removal probability is not a constant. For
example, airports might have different security measures, or differ in their
vulnerability to an attack or weather related shutdown due to their
geographical location. Also, we have noted recently [11] that a measure of the
dynamical importance of nodes is proportional to $v_i u_i$, where $u$ and $v$
are the right and left eigenvectors corresponding to the largest eigenvalue
$\lambda$ of the adjacency matrix of the network, $A$: $Au = \lambda u$,
$v^T A = \lambda v^T$. Thus, a potential removal strategy (to be used in an
example later in this paper) is that in which nodes are removed based on the
value of $v_i u_i$. More generally, we would like to study the effect of node
removal strategies that assign a probability $p_i$ to the removal of node $i$,
and we refer to this problem as weighted percolation. (Other previously
considered possibilities are that highly connected nodes are preferentially
removed from the network, e.g., $p_i$ is proportional to the degree of node
$i$ [?], or to a power of the degree of node $i$.)

Our objective is to present a simple heuristic method for treating general
weighted percolation removal strategies (i.e., general $p_i$) on directed
networks without the need for a Markovian network model. While we use neither
the Markovian assumption nor a specific node removal probability, we assume a
locally tree-like network structure (we will discuss the validity of this
assumption below), and we also require knowledge of the network adjacency
matrix $A$. We find that the network disintegrates, as defined by the
disappearance of the giant connected components, when the node removal
strategy is such that the largest eigenvalue $\hat\lambda$ of the matrix
$\hat A$ with entries $\hat A_{ij} = A_{ij}(1 - p_i)$ is less than 1.
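The criterion just stated can be evaluated directly from an adjacency matrix and a vector of removal probabilities. The sketch below is a minimal illustration using dense NumPy linear algebra; the function names and the three-node example are assumptions made here for concreteness, not code from the original work.

```python
import numpy as np

def weighted_adjacency(A, p):
    """Return A_hat with entries A_hat[i, j] = A[i, j] * (1 - p[i])."""
    A = np.asarray(A, dtype=float)
    p = np.asarray(p, dtype=float)
    return A * (1.0 - p)[:, None]

def largest_eigenvalue(M):
    """Largest eigenvalue of a nonnegative matrix (real by the Perron-Frobenius theorem)."""
    return float(np.max(np.linalg.eigvals(M).real))

def is_disintegrated(A, p):
    """Predicted criterion: no giant components when lambda_hat < 1."""
    return largest_eigenvalue(weighted_adjacency(A, p)) < 1.0

# Directed 3-cycle 0 -> 1 -> 2 -> 0, used only as a tiny arithmetic check.
# With p = (0.5, 0, 0) the eigenvalues are the cube roots of 0.5.
A = np.array([[0, 1, 0],
              [0, 0, 1],
              [1, 0, 0]])
print(largest_eigenvalue(weighted_adjacency(A, [0.5, 0.0, 0.0])))
```

For large sparse networks a sparse eigensolver (for example `scipy.sparse.linalg.eigs` with `k=1`) would likely be the more practical choice than the dense routine used here.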

To obtain the above result we adapt the mean field arguments given, for
example, in Refs. [?] and [?] to our case. Consider first the disappearance of
the giant in-component GIN. Let $\eta_i$ be the probability that node $i$ is
not in the giant in-component GIN. Node $i$ is not in GIN either if it has
been removed (with probability $p_i$), or if it has not been removed and none
of its out-links point to nodes in GIN. Consider two such nodes $j_1$ and
$j_2$ that $i$ points to. We argue that it is reasonable to make the
approximation that whether $j_1$ belongs to GIN is independent of whether
$j_2$ belongs to GIN. Whether $j_1$ is in GIN depends on whether the nodes it
points to are in GIN, which depends on the nodes they point to, and so on. Our
locally tree-like assumption implies that the nodes that can be reached from
$j_1$ by a short path are essentially independent of the nodes that can be
reached from $j_2$ by a short path. Based on this independence assumption,
$\eta_i$ is given by (recall that $A_{ij} = 0$ or 1)
$\eta_i = p_i + (1 - p_i)\prod_{j=1}^{N} (\eta_j)^{A_{ij}}$.
This equation always has the trivial solution $\eta_i = 1$. The presence of a
giant in-component requires a solution for which the expected size
$s = N - \sum_i \eta_i$ is positive. Setting $\eta_i = e^{-z_i}$ and assuming
$0 < z_i \ll 1$ and $\sum_{j=1}^{N} A_{ij} z_j \ll 1$, we obtain the
approximation $z_i = \sum_{j=1}^{N} A_{ij}(1 - p_i) z_j$. When $p_i = 1$
(i.e., all nodes are removed), the only solution is the trivial solution
$z_i = 0$. As we decrease the $p_i$'s, a nontrivial solution (corresponding to
a giant in-component) first appears when the largest eigenvalue $\hat\lambda$
of the matrix $\hat A$ with entries $\hat A_{ij} = A_{ij}(1 - p_i)$ is 1. Note
that, as required, we can satisfy $\eta_i < 1$ since the components $z_i$ of
the eigenvector corresponding to $\hat\lambda$ are, by the Frobenius theorem,
nonnegative. Applying the same reasoning to the out-component GOUT we find
that it appears when the largest eigenvalue of the matrix with entries
$B_{ij} = A_{ji}(1 - p_j) = (\hat A^T)_{ij}$ is 1. Since $\hat A$ and its
transpose have the same spectrum, the giant in and out-components appear
simultaneously.
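The self-consistency equation for $\eta_i$ can also be solved numerically by direct iteration, which gives a way to check the linearized eigenvalue condition. The sketch below is an illustration under stated assumptions: the function names are introduced here, and the test network (each node linking to two others by a fixed rule) is a hypothetical choice made only because every row of its adjacency matrix sums to 2, so that $\hat\lambda = 2(1 - p)$ for uniform $p$. It is not claimed to be locally tree-like.

```python
import numpy as np

def gin_exclusion_probabilities(A, p, n_iter=500):
    """Iterate eta_i = p_i + (1 - p_i) * prod_j eta_j**A_ij, starting from eta = 0."""
    A = np.asarray(A)
    p = np.asarray(p, dtype=float)
    eta = np.zeros(len(p))
    for _ in range(n_iter):
        # Use the factor eta_j where A_ij = 1 and the factor 1 where A_ij = 0.
        factors = np.where(A > 0, eta[None, :], 1.0)
        eta = p + (1.0 - p) * factors.prod(axis=1)
    return eta

def two_out_network(N):
    """Hypothetical test network: node i links to (2i+1) mod N and (2i+2) mod N."""
    A = np.zeros((N, N), dtype=int)
    for i in range(N):
        A[i, (2 * i + 1) % N] = 1
        A[i, (2 * i + 2) % N] = 1
    return A

A = two_out_network(50)
for p_uniform in (0.25, 0.75):
    eta = gin_exclusion_probabilities(A, np.full(50, p_uniform))
    lam_hat = 2.0 * (1.0 - p_uniform)      # every row of A sums to 2
    print(p_uniform, lam_hat, 50 - eta.sum())
```

Starting from $\eta = 0$ the iterates increase monotonically toward the smallest fixed point, so the iteration returns the nontrivial solution when one exists and the trivial solution $\eta_i = 1$ otherwise.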

The above can also be understood by the following heuristic argument. Our
previous discussion applies not only to GOUT (GIN), but more generally to sets
generated by repeatedly following outgoing (incoming) links starting from a
given node. Therefore, one can estimate the size of such sets and locate the
transition as the point at which one of them has macroscopic size. In doing
so, it is essential not to overcount the number of nodes, and it is here where
our assumption of locally tree-like network structure allows us to simplify
the problem. The number of directed paths of length $m$ starting from node $i$
can be estimated using this assumption, for not too large $m$, as the sum of
the components of the vector $\hat A^m e_i$, where $e_i$ is the unit vector
for coordinate $i$. If the largest eigenvalue of $\hat A$ is larger than 1,
the number of paths of length $m$ grows exponentially with $m$ for some
starting node $i$. Under our assumptions, these paths traverse different
nodes, and thus the out-component of $i$ has large size, in agreement with our
previous result.
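The path-counting argument can be made concrete with repeated matrix-vector products. The following sketch is an illustration only: the function name and the regular test network are assumptions introduced here. With the convention that $A_{ij} = 1$ denotes a link from $i$ to $j$, the code propagates the unit vector as a row vector so that the entries it sums are $(\hat A^m)_{ij}$; either orientation grows at the rate set by the largest eigenvalue.

```python
import numpy as np

def path_counts_from(A_hat, i, max_len):
    """Weighted number of directed paths of length 1..max_len that start at node i."""
    A_hat = np.asarray(A_hat, dtype=float)
    row = np.zeros(A_hat.shape[0])
    row[i] = 1.0                      # unit vector e_i, used as a row vector
    counts = []
    for _ in range(max_len):
        row = row @ A_hat             # entries (A_hat^m)[i, j]: paths from i to j
        counts.append(row.sum())
    return counts

# Hypothetical example: node i links to (2i+1) mod N and (2i+2) mod N.
N = 64
A = np.zeros((N, N))
for i in range(N):
    A[i, (2 * i + 1) % N] = 1.0
    A[i, (2 * i + 2) % N] = 1.0

counts = path_counts_from(0.75 * A, 0, 5)     # uniform removal probability p = 0.25
ratios = [b / a for a, b in zip(counts, counts[1:])]
print(counts)
print(ratios)   # successive ratios approach lambda_hat, here 2 * 0.75
```

Because this count includes paths that revisit nodes, it is a good proxy for the size of the out-component only while the locally tree-like assumption holds, that is, for not too large path length.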

In order to motivate the locally tree-like assumption, we consider the
illustrative case of uncorrelated networks. (While our concern in this paper
is not with uncorrelated networks, they, nevertheless, provide a useful
example of why the locally tree-like assumption might be valid in some cases.)
In particular, we estimate the fraction of bidirectional edges. The
probability $p_{ij}$ that nodes $i$ and $j$ share a bidirectional edge is
given by $p_{ij} = d_i^{out} d_j^{in} d_j^{out} d_i^{in}/(N^2\langle d\rangle^2)$,
where, since $\langle d^{in}\rangle = \langle d^{out}\rangle$, we use
$\langle d\rangle$ to denote either of these averages. In the case that the
degrees at nodes $i$ and $j$ are uncorrelated, on average,
$\langle p_{ij}\rangle = \left(\langle d^{out} d^{in}\rangle/(N\langle d\rangle)\right)^2 \approx (\lambda/N)^2$,
where we have used the mean field approximation for the maximum eigenvalue
$\lambda$ of $A$ (see below). Not too far above the percolation transition
$\lambda$ is of order 1, so the expected number of bidirectional edges,
roughly $N^2\langle p_{ij}\rangle \approx \lambda^2$, is negligible compared
with the total number of edges $N\langle d\rangle$, and we thus expect the
locally tree-like assumption to be valid.

We next discuss how our results compare to those of Ref. [8] in the case of
negligibly few bidirectional edges (i.e., Eq. (1)). If $p_i = p$, our result
for the critical probability $p_c$ reduces to $(1 - p_c)\lambda = 1$, where
$\lambda$ is the largest eigenvalue of the adjacency matrix $A$. If the
network is degree Markovian, and we let $\psi_z^{(m)}$ be the average number
of directed paths of length $m$ starting from nodes of degree $z$, we have
$\psi_z^{(m+1)} = d^{out}\sum_{z'} P(z'|z)\,\psi_{z'}^{(m)}$. Since, for large
$m$, the number of paths of length $m$ grows like $\lambda^m$, we associate to
the previous equation the eigenvalue problem
$\lambda_M \psi_z = d^{out}\sum_{z'} P(z'|z)\,\psi_{z'}$, where $\lambda_M$ is
the Markovian approximation to $\lambda$. The previous result agrees with
Eq. (1) [the matrices $d^{out}P(z'|z)$ and $(d^{out})'P(z'|z)$ have the same
spectrum]. We note that, in the absence of degree-degree correlations, we have
$P(z'|z) = (d^{in})'P(z')/\langle d\rangle$, which when inserted in the
eigenvalue equation yields the mean field approximation for the eigenvalue,
$\lambda_{mf} = \langle d^{in} d^{out}\rangle/\langle d\rangle$.
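The mean field approximation $\lambda_{mf} = \langle d^{in} d^{out}\rangle/\langle d\rangle$ is easy to compare against the exact largest eigenvalue for a given adjacency matrix. The sketch below does this for a small random network with independent links; the function names, network size, mean degree, and random seed are all illustrative assumptions.

```python
import numpy as np

def mean_field_eigenvalue(A):
    """lambda_mf = <d_in * d_out> / <d>, with A[i, j] = 1 for a link i -> j."""
    A = np.asarray(A, dtype=float)
    d_out = A.sum(axis=1)             # row sums
    d_in = A.sum(axis=0)              # column sums
    return float((d_in * d_out).mean() / d_out.mean())

def largest_eigenvalue(A):
    """Largest (real) eigenvalue of a nonnegative matrix."""
    return float(np.max(np.linalg.eigvals(np.asarray(A, dtype=float)).real))

# Small random directed network with independent links (illustrative parameters).
rng = np.random.default_rng(0)
N, mean_degree = 400, 4.0
A = (rng.random((N, N)) < mean_degree / N).astype(float)
np.fill_diagonal(A, 0.0)

print(mean_field_eigenvalue(A), largest_eigenvalue(A))
```

The two numbers are expected to be close for uncorrelated networks and to differ when degree-degree correlations are present, which is the situation the eigenvalue criterion is meant to handle.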

We now illustrate our theory with two numerical exam- 
ples. The first example (Example 1) illustrates the flexi- 
bility of our approach to address various weighted perco- 
lation node removal strategies, while the second example 
(Example 2) illustrates the point that our approach does 
not require the knowledge or applicability of a Markov 
network model. 

Example 1. For simplicity, we consider uncorrelated random networks with
degree distributions $P(d^{in}, d^{out})$ in which $d^{in}$ and $d^{out}$ are
independent and have the same distribution $P(d)$, that is,
$P(d^{in}, d^{out}) = P(d^{in})P(d^{out})$. We use a generalization of the
method in Ref. [13] in order to generate networks with a power law degree
distribution, $P(d) \propto d^{-\gamma}$. We choose the sequence of expected
degrees $d_i^{in} = c(i + i_0 - 1)^{-1/(\gamma - 1)}$ for the in-degrees, and
a random permutation of this sequence for the out-degrees, where
$i = 1, \ldots, N$, and $c$ and $i_0$ are chosen to obtain a desired maximum
and average degree. Then, the adjacency matrix is constructed by setting
$A_{ij} = 1$ for $i \neq j$ with probability
$d_i^{out} d_j^{in}/(N\langle d\rangle)$ and zero otherwise ($A_{ii} = 0$).
The ensemble expected value of the resulting network degree distribution is
given by $P(d^{in}, d^{out})$. (Note that we assume
$d_i^{out} d_j^{in} < N\langle d\rangle$.) In Fig. 1(a) we show, for a
$N = 2000$ scale free network with exponent $\gamma = 2.5$ and
$\langle d\rangle = 3$, the size of GIN as a function of the number of removed
nodes $R$, when nodes are removed in order of decreasing $d_i^{in} d_i^{out}$
(thick solid line), decreasing dynamical importance [11] ($v_i u_i/v^T u$)
(thin solid line), and randomly (dashed line) [14]. The removal probabilities
in the first two cases are given by $p_i = 1$ if $i \in S$ and 0 otherwise for
a subset $S$ of nodes. (For example, in the first case
$S = \{i : d_i^{in} d_i^{out} > d_*^2\}$ for a threshold $d_*^2$, and $\hat A$
reduces to the matrix obtained by removal of all nodes in $A$ for which
$d_i^{in} d_i^{out} > d_*^2$.) We also show in Fig. 1(b) the largest
eigenvalue $\hat\lambda$ of $\hat A$, which in this case is equivalent to the
adjacency matrix of the network resulting from the removal of the nodes. We
observe for all three cases that the network disintegrates, as predicted, when
$\hat\lambda = 1$ (indicated with the vertical arrows in Fig. 1(b)). The
number of removed nodes necessary to disintegrate the network is smallest when
removing nodes by dynamical importance. Removal of nodes by
$d_i^{in} d_i^{out}$ requires somewhat more nodes to disintegrate the network,
while random removal requires removal of the most.

FIG. 1: (a) Ratio of the largest connected component to the number of
remaining nodes, and (b) largest eigenvalue of the adjacency matrix of the
network after node removal as a function of the number of removed nodes $R$,
when nodes are removed randomly (triangles and thick solid line in (a) and
(b), respectively), in order of decreasing $d_i^{out} d_i^{in}$ (stars and
thin solid line), and in order of decreasing dynamical importance (boxes and
dashed line) (see text). The percolation transition occurs when
$\hat\lambda = 1$ [indicated by the vertical arrows in (a)].
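The network construction described in Example 1 can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are introduced here, the values of `c` and `i0` are illustrative choices (picked so that the average expected degree is close to 3), and the clipping of probabilities at 1 is a safeguard added in this sketch, since the text simply assumes $d_i^{out} d_j^{in} < N\langle d\rangle$.

```python
import numpy as np

def expected_degrees(N, gamma, c, i0):
    """Expected degrees d_i = c * (i + i0 - 1)**(-1/(gamma - 1)) for i = 1..N."""
    i = np.arange(1, N + 1, dtype=float)
    return c * (i + i0 - 1.0) ** (-1.0 / (gamma - 1.0))

def generate_network(d_in, d_out, rng):
    """Set A[i, j] = 1 (i != j) with probability d_out[i] * d_in[j] / (N * <d>)."""
    N = len(d_in)
    mean_d = d_in.mean()              # equals d_out.mean() when d_out permutes d_in
    prob = np.outer(d_out, d_in) / (N * mean_d)
    prob = np.minimum(prob, 1.0)      # safeguard added here; the text assumes prob < 1
    A = (rng.random((N, N)) < prob).astype(int)
    np.fill_diagonal(A, 0)            # A_ii = 0
    return A

rng = np.random.default_rng(1)
N, gamma = 2000, 2.5
d_in = expected_degrees(N, gamma, c=200.0, i0=20.0)   # c, i0: illustrative values only
d_out = rng.permutation(d_in)
A = generate_network(d_in, d_out, rng)
print(A.sum() / N, d_in.mean())       # realized vs expected average degree
```

A targeted removal strategy such as the first one in Example 1 then amounts to zeroing the rows and columns of `A` for the nodes with the largest products of in-degree and out-degree and recomputing the largest eigenvalue.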

The simplest Markovian network assumption is that correlations between
connected nodes depend solely on the degree of these nodes and not on any
other property the nodes might have. While this is a useful analytical
framework, probably applicable to many cases, it is also likely to fail in
other cases (e.g., [15]). In our next example we will consider a network in
which the node statistics depend on variables additional to the degrees and
for which, therefore, a degree based Markovian approximation is not a priori
expected to apply.

FIG. 2: Normalized size of the GIN as a function of the fraction of remaining
nodes, $1 - p$, for Example 2. The plot shows ten different realizations of
the node removal process, and the inset shows their mean.

Example 2. We start with a network generated as in Example 1 with
$N = 10^5$, $\gamma = 2.5$, and $\langle d\rangle = 3$. Then, we first specify
a division of the nodes in the network into two groups of the same size,
$\mathcal{A}$ and $\mathcal{B}$
($\mathcal{A} \cup \mathcal{B} = \{1, 2, \ldots, N\}$,
$\mathcal{A} \cap \mathcal{B} = \emptyset$). We define a measure of the
degree-degree correlations
$\rho = \langle d_i^{in} d_j^{out}\rangle_e/(\langle d_i^{in}\rangle_e \langle d_j^{out}\rangle_e)$,
where $\langle\ldots\rangle_e$ denotes an average over edges,
$\langle Q_{ij}\rangle_e = \sum_{i,j} A_{ij} Q_{ij}/\sum_{i,j} A_{ij}$. The
following (an adaptation of the method in Ref. [16]) is repeated until the
network has the desired amount of degree-degree correlations as evidenced in
the value of $\rho$: Two edges are chosen at random, say connecting node $i$
to node $j$ and node $n$ to node $m$. If $i$, $j$, $n$, and $m$ are all in
$\mathcal{A}$, the edges are replaced with two edges connecting node $i$ to
node $m$ and node $n$ to node $j$ if
$(d_i^{in} d_j^{out} + d_n^{in} d_m^{out} - d_i^{in} d_m^{out} - d_n^{in} d_j^{out}) < 0$.
If $i$, $j$, $n$, and $m$ are all in $\mathcal{B}$, the edges are replaced
with two edges connecting node $i$ to node $m$ and node $n$ to node $j$ if
$(d_i^{in} d_j^{out} + d_n^{in} d_m^{out} - d_i^{in} d_m^{out} - d_n^{in} d_j^{out}) > 0$.
Otherwise the edges are unchanged. The effect of this division is to create
two subnetworks, $\mathcal{A}$ and $\mathcal{B}$, with positive and negative
degree-degree correlations. Starting from such a network, we successively
remove a randomly chosen node and compute the size of the GIN relative to its
initial size. In Fig. 2 we plot this normalized size of the GIN as a function
of the fraction of remaining nodes for ten realizations of the node removal
sequence. The vertical arrow represents the prediction from the eigenvalue.
Although the transition points of individual realizations have some spread,
the arrow gives a good approximation of their mean (see inset).
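The edge-swapping procedure of Example 2 can be sketched as below. Several details are assumptions of this sketch rather than statements from the text: the function names, the small network size, the alternating assignment of nodes to the two groups, the sign convention (group 0 accepts swaps that raise the edge-averaged product of source in-degree and target out-degree, group 1 accepts swaps that lower it), and the extra guard that rejects swaps creating self-loops or duplicate edges.

```python
import numpy as np

def edge_correlation(edges, d_in, d_out):
    """rho = <d_in[i] * d_out[j]>_e / (<d_in[i]>_e * <d_out[j]>_e) over edges i -> j."""
    src = np.array([i for i, _ in edges])
    dst = np.array([j for _, j in edges])
    return (d_in[src] * d_out[dst]).mean() / (d_in[src].mean() * d_out[dst].mean())

def rewire_step(edges, edge_set, d_in, d_out, group, rng):
    """Try one swap (i->j, n->m) -> (i->m, n->j); return True if it was applied."""
    a, b = rng.choice(len(edges), size=2, replace=False)
    (i, j), (n, m) = edges[a], edges[b]
    if len({group[i], group[j], group[n], group[m]}) != 1:
        return False                       # the four nodes are not all in one group
    # Change in the sum over edges of d_in[source] * d_out[target] caused by the swap.
    gain = (d_in[i] * d_out[m] + d_in[n] * d_out[j]
            - d_in[i] * d_out[j] - d_in[n] * d_out[m])
    # Assumed sign convention: group 0 raises the correlation, group 1 lowers it.
    accept = gain > 0 if group[i] == 0 else gain < 0
    # Extra guard (not in the text): avoid self-loops and duplicate edges.
    if not accept or i == m or n == j or (i, m) in edge_set or (n, j) in edge_set:
        return False
    edge_set.difference_update([(i, j), (n, m)])
    edge_set.update([(i, m), (n, j)])
    edges[a], edges[b] = (i, m), (n, j)
    return True

# Small illustrative network; the swap preserves every in- and out-degree.
rng = np.random.default_rng(2)
N = 200
mask = rng.random((N, N)) < 0.02
np.fill_diagonal(mask, False)
edges = [(int(i), int(j)) for i, j in np.argwhere(mask)]
d_out = np.bincount([i for i, _ in edges], minlength=N)
d_in = np.bincount([j for _, j in edges], minlength=N)
group = np.arange(N) % 2                   # two groups of equal size (illustrative split)

rho_before = edge_correlation(edges, d_in, d_out)
edge_set = set(edges)
applied = sum(rewire_step(edges, edge_set, d_in, d_out, group, rng) for _ in range(20000))
print(applied, rho_before, edge_correlation(edges, d_in, d_out))
```

Since each swap keeps the source of one edge and the target of the other, the degree sequences are unchanged and only the correlations between connected nodes are modified.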

We now discuss the advantages and disadvantages of the eigenvalue approach
when compared to the Markov approximation. As opposed to the Markov
approximation, the eigenvalue approximation allows the easy treatment of
general node removal strategies ('weighted percolation'). Furthermore, it does
not require the assumption that the node correlations depend only on their
degree and extend only to nearest neighbors. In addition, the construction of
the matrix $d^{out}P(z'|z)$ and the determination of its largest eigenvalue is
in some cases harder than the direct determination of the largest eigenvalue
of the adjacency matrix $A$. On the other hand, in many cases the adjacency
matrix of the network is not known, and local sampling methods from which an
approximation to the matrix $d^{out}P(z'|z)$ can be constructed must be used.
Additionally, the eigenvalue approach is valid only when the network has
locally tree-like structure.

In conclusion, we have presented a simple eigenvalue- 
based criterion for percolation on directed networks. Our 
method should be viewed as complementary to previous 
studies in that it does not require knowledge or applica- 
bility of a Markov network model and can treat general 
node removal strategies, but requires knowledge of the 
network adjacency matrix A and only applies when the 
network has locally tree-like structure.